ABSTRACT


A plausible $s$-factor solution for many types of psychological and
educational tests is one in which there is one general factor and $s - 1$
group or method-related factors. The bi-factor solution results from the
constraint that each item has a non-zero loading on the primary dimension,
$\alpha_{j1}$, and on at most one of the $s - 1$ group factors. This structure
has been termed the "bi-factor" solution by Holzinger & Swineford, but it also
appears in the work of Tucker and Jöreskog. All attempts at estimating the
parameters of this model have been restricted to continuously measured
variables; it has not previously been considered in the context of
item-response theory (IRT). It is conceivable, however, that the bi-factor
structure might arise in IRT-related problems.

The purpose of this paper is to derive a bi-factor item-response model for
binary response data, and to develop a corresponding method of parameter
estimation. The bi-factor restriction leads to a major simplification of the
likelihood equations that (1) permits the statistical evaluation of problems
of unlimited dimensionality, (2) permits conditional dependence among discrete
and previously identified subsets of items, and (3) in some cases provides
more parsimonious factor solutions than an unrestricted full-information item
factor analysis might provide (e.g., Bock and Aitkin, 1981).
1  Introduction

Consider the case in which, for $n$ variables, an $s$-factor solution exists
in which there is one general factor and $s - 1$ group or method-related
factors. The bi-factor solution constrains each item $j$ to have a non-zero
loading on the primary dimension, $\alpha_{j1}$, and on not more than one of
the $s - 1$ group factors (i.e., $\alpha_{jh}$, $h = 2, \ldots, s$). For four
items, the factor-pattern matrix might be

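$$
\boldsymbol{\alpha} =
\begin{bmatrix}
\alpha_{11} & \alpha_{12} & 0 \\
\alpha_{21} & \alpha_{22} & 0 \\
\alpha_{31} & 0 & \alpha_{33} \\
\alpha_{41} & 0 & \alpha_{43}
\end{bmatrix}
$$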
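
As an illustration only, the following Python sketch builds a pattern matrix of this form and the correlation matrix it implies for the underlying standardized variables, $\rho_{ij} = \sum_h \alpha_{ih}\alpha_{jh}$. The function name `bifactor_pattern` and all loading values are hypothetical; they are not taken from the data analyzed in this paper.

```python
import numpy as np

def bifactor_pattern(general, group_loadings, group_of_item, n_groups):
    """Build an n x (1 + n_groups) bi-factor pattern matrix.

    Column 0 holds the general-factor loadings; item j receives its single
    group loading in column group_of_item[j] (1-based group index).
    """
    n = len(general)
    alpha = np.zeros((n, 1 + n_groups))
    alpha[:, 0] = general
    for j in range(n):
        alpha[j, group_of_item[j]] = group_loadings[j]
    return alpha

# Hypothetical loadings arranged like the four-item pattern above.
alpha = bifactor_pattern(
    general=[0.6, 0.5, 0.8, 0.4],
    group_loadings=[0.3, 0.4, 0.2, 0.5],
    group_of_item=[1, 1, 2, 2],
    n_groups=2,
)

# Implied correlations: items in different groups are related only through
# the general factor, items in the same group through both factors.
rho = alpha @ alpha.T
np.fill_diagonal(rho, 1.0)
print(np.round(rho, 3))
```

Items 1 and 3 load on different group factors, so their implied correlation is just the product of their general loadings; items 1 and 2 share a group factor and pick up an additional term.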

This structure has been termed the "bi-factor" solution by Holzinger &
Swineford (1937) and inter-battery factor analysis by Tucker (1958), and is
also one of the confirmatory factor analysis models considered by Jöreskog
(1969). In these applications, the model is restricted to test scores, assumed
to be continuously distributed. It is easy, however, to conceive of situations
in which the bi-factor pattern might arise at the item level. It is plausible
for paragraph comprehension tests, for example, in which case the primary
dimension describes the targeted aptitude and the additional factors describe
knowledge of the content area within the paragraphs. In this context, items
would be conditionally independent between paragraphs, but conditionally
dependent within specific paragraphs.

The purpose of this paper is to derive an item-response model for binary
response data that exhibit the bi-factor structure and to develop a
corresponding method of parameter estimation. Of course, other types of tests
that consist of items tapping different content areas would also be suitable
for this type of analysis. As we will show, the bi-factor restriction leads to
a major simplification of the likelihood equations that (1) permits the
statistical evaluation of problems of unlimited dimensionality, (2) permits
conditional dependence among discrete and previously identified subsets of
items, and (3) in some cases provides more parsimonious factor solutions than
an unrestricted full-information item factor analysis might provide (e.g.,
Bock and Aitkin, 1981). In the following sections, we derive the likelihood
and its first derivatives so that an EM solution to item bi-factor analysis
may be obtained.

2  Likelihood  Evaluation 

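Stuart (1958) showed that if $n$ variables follow a standardized multivariate
normal distribution where the correlation
$\rho_{ij} = \sum_{h=1}^{s} \alpha_{ih}\alpha_{jh}$ and $\alpha_{ih}$ is
nonzero for only one $h$, then the probability that the respective variables
are simultaneously less than $\gamma_j$ is

$$
P = \prod_{h=1}^{s} \int_{-\infty}^{\infty}
\left[ \prod_{j=1}^{n_h}
\Phi\!\left( \frac{\gamma_j - \alpha_{jh}\, y}{(1 - \alpha_{jh}^2)^{1/2}} \right)
\right] f(y)\, dy,
\tag{1}
$$

where

$$
f(t) = \frac{\exp(-t^2/2)}{(2\pi)^{1/2}}, \qquad
\Phi(\gamma) = \int_{-\infty}^{\gamma} f(t)\, dt,
$$

and $n_h$ is the number of items loading on dimension $h$ ($h = 1, \ldots, s$).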

Equation (1) follows from the fact that if each variate is related to only a
single dimension, then the $s$ dimensions are independent, and the joint
probability is simply the product of the $s$ unidimensional probabilities. In
the present context, this result applies only to the $s - 1$ "nuisance"
dimensions (i.e., $h = 2, \ldots, s$); if a primary dimension exists, it will
not be independent of the other $s - 1$ dimensions. Computing this probability
therefore requires a two-dimensional generalization of Stuart's (1958)
original result.

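To derive the two-dimensional result, we begin by noting that the probability
for the primary dimension can be obtained using the formula of Dunnett and
Sobel (1955),

$$
P = \int_{-\infty}^{\infty}
\left[ \prod_{j=1}^{n}
\Phi\!\left( \frac{\gamma_j - \alpha_j\, y}{(1 - \alpha_j^2)^{1/2}} \right)
\right] f(y)\, dy,
\tag{2}
$$

which is valid as long as $\rho_{ij} = \alpha_i \alpha_j$. Of course, this
directly implies a unidimensional problem. Combining the two results yields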


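$$
P = \int_{-\infty}^{\infty}
\left\{ \prod_{h=2}^{s} \int_{-\infty}^{\infty}
\left[ \prod_{j=1}^{n_h}
\Phi\!\left( \frac{\gamma_j - \alpha_{j1}\, z - \alpha_{jh}\, y}
{(1 - \alpha_{j1}^2 - \alpha_{jh}^2)^{1/2}} \right)
\right] f(y)\, dy \right\} f(z)\, dz,
\tag{3}
$$

which can be approximated to any practical degree of accuracy using
Gauss-Hermite quadrature (Stroud and Sechrest, 1966). What is important about
this result is that, if the assumptions are reasonable (as they clearly are
for many IRT applications), the probability of any response pattern can be
obtained by a two-dimensional integration, regardless of the dimensionality
$s$.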
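
The dimension reduction can be checked numerically. The sketch below evaluates a probability of the form (3) with nested one-dimensional Gauss-Hermite rules and compares it, for two variables with zero thresholds, against the closed form $1/4 + \arcsin(\rho)/(2\pi)$ for a bivariate normal orthant probability. It assumes NumPy is available; the function `bifactor_orthant_prob`, its argument names, and the loadings are hypothetical.

```python
import math
from numpy.polynomial.hermite_e import hermegauss

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bifactor_orthant_prob(gamma, a_general, a_group, group, n_nodes=30):
    """P(all y_j < gamma_j) under a bi-factor structure, in the form of (3).

    group[j] labels the single group factor on which item j loads.
    """
    # hermegauss uses the weight exp(-x^2 / 2); dividing by sqrt(2*pi)
    # turns the weights into standard normal probabilities.
    nodes, weights = hermegauss(n_nodes)
    weights = weights / math.sqrt(2.0 * math.pi)

    total = 0.0
    for z, wz in zip(nodes, weights):              # general dimension
        product_over_groups = 1.0
        for h in sorted(set(group)):               # one 1-D integral per group
            items = [j for j, g in enumerate(group) if g == h]
            inner = 0.0
            for y, wy in zip(nodes, weights):      # group dimension h
                term = 1.0
                for j in items:
                    sd = math.sqrt(1.0 - a_general[j] ** 2 - a_group[j] ** 2)
                    arg = (gamma[j] - a_general[j] * z - a_group[j] * y) / sd
                    term *= norm_cdf(arg)
                inner += term * wy
            product_over_groups *= inner
        total += product_over_groups * wz
    return total

a_general = [0.6, 0.4]   # hypothetical general loadings
a_group = [0.5, 0.3]     # hypothetical group loadings

# Both items on the same group factor: rho = 0.6*0.4 + 0.5*0.3.
p_same = bifactor_orthant_prob([0.0, 0.0], a_general, a_group, group=[2, 2])
exact_same = 0.25 + math.asin(0.6 * 0.4 + 0.5 * 0.3) / (2.0 * math.pi)

# Items on different group factors: rho = 0.6*0.4 only.
p_diff = bifactor_orthant_prob([0.0, 0.0], a_general, a_group, group=[2, 3])
exact_diff = 0.25 + math.asin(0.6 * 0.4) / (2.0 * math.pi)

print(p_same, exact_same)
print(p_diff, exact_diff)
```

The work grows with the number of group factors only linearly, because each group contributes its own inner one-dimensional sum at every node of the general dimension.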

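For example, if $y_j = \sum_{h=1}^{s} \alpha_{jh}\theta_h + \varepsilon_j$ and
we assume that

$$
y_j \sim N(0, 1), \qquad
\theta \sim N(0, I), \qquad \text{and} \qquad
\varepsilon_j \sim N\!\left(0,\; 1 - \sum_{h=1}^{s} \alpha_{jh}^2\right),
$$

then the unconditional probability of observing score pattern $x = x_\ell$ is

$$
P_\ell = \int_{-\infty}^{\infty}
\left\{ \prod_{h=2}^{s} \int_{-\infty}^{\infty}
\left[ \prod_{j=1}^{n_h}
F_j(\theta_1, \theta_h)^{x_{\ell j}}
\bigl(1 - F_j(\theta_1, \theta_h)\bigr)^{1 - x_{\ell j}}
\right] f(\theta_h)\, d\theta_h \right\} f(\theta_1)\, d\theta_1,
\tag{4}
$$

which can be approximated by

$$
P_\ell \approx \sum_{q_1}^{Q}
\left\{ \prod_{h=2}^{s} \sum_{q_h}^{Q}
\left[ \prod_{j=1}^{n_h}
F_j(X_{q_1}, X_{q_h})^{x_{\ell j}}
\bigl(1 - F_j(X_{q_1}, X_{q_h})\bigr)^{1 - x_{\ell j}}
\right] A(X_{q_h}) \right\} A(X_{q_1}),
\tag{5}
$$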


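where

$$
F_j(X_{q_1}, X_{q_h}) =
\Phi\!\left[ -\,\frac{\gamma_j - \alpha_{j1} X_{q_1} - \alpha_{jh} X_{q_h}}
{\sqrt{1 - \alpha_{j1}^2 - \alpha_{jh}^2}} \right],
$$

and $X_q$ and $A(X_q)$ are the nodes and corresponding weights of a
Gauss-Hermite quadrature.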
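
A vectorized sketch of the approximation in (5) follows. It assumes that a correct response corresponds to $y_j$ exceeding the threshold $\gamma_j$, so that $F_j$ is the normal ogive evaluated at the quadrature nodes; that sign convention, the function name `pattern_prob`, and the parameter values are assumptions made for illustration. The check at the end uses only the fact that the probabilities of all $2^n$ response patterns must sum to one.

```python
import itertools
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

_erf = np.vectorize(math.erf)

def norm_cdf(z):
    """Standard normal distribution function, applied elementwise."""
    return 0.5 * (1.0 + _erf(np.asarray(z) / math.sqrt(2.0)))

def pattern_prob(x, gamma, a_general, a_group, group, n_nodes=10):
    """Approximate P_l in (5) for a single binary response pattern x."""
    X, A = hermegauss(n_nodes)
    A = A / math.sqrt(2.0 * math.pi)            # weights A(X_q), summing to 1
    x = np.asarray(x)
    gamma = np.asarray(gamma, dtype=float)
    a_general = np.asarray(a_general, dtype=float)
    a_group = np.asarray(a_group, dtype=float)
    group = np.asarray(group)
    sd = np.sqrt(1.0 - a_general ** 2 - a_group ** 2)

    over_groups = np.ones(n_nodes)              # indexed by the node q_1
    for h in np.unique(group):
        like = np.ones((n_nodes, n_nodes))      # rows: q_1, columns: q_h
        for j in np.flatnonzero(group == h):
            num = gamma[j] - a_general[j] * X[:, None] - a_group[j] * X[None, :]
            F = norm_cdf(-num / sd[j])          # assumed P(correct) at each node pair
            like *= F if x[j] == 1 else 1.0 - F
        over_groups *= like @ A                 # sum over q_h for this group
    return float(over_groups @ A)               # sum over q_1

# Hypothetical four-item test: two items in each of two paragraphs.
gamma = [-0.2, 0.1, 0.4, -0.5]
a_general = [0.6, 0.5, 0.8, 0.4]
a_group = [0.3, 0.4, 0.2, 0.5]
group = [2, 2, 3, 3]

total = sum(
    pattern_prob(x, gamma, a_general, a_group, group)
    for x in itertools.product([0, 1], repeat=4)
)
print(total)
```

Each group needs only a $Q \times Q$ table of likelihood values, which is the practical meaning of the two-dimensional integration regardless of $s$.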


3  Marginal  Maximum  Likelihood  Estimation 

The parameters of the item bi-factor analysis model can be estimated by the
method of marginal maximum likelihood using a variation of the approach
described by Bock & Aitkin (1981). The parameters of this model include $n$
"thresholds" or "intercepts", $n$ primary factor loadings or "slopes", and a
total of $n$ factor loadings or slopes on the $h = 2, \ldots, s$ additional
dimensions (i.e., $\sum_{h=2}^{s} n_h = n$). The likelihood equations are
derived as follows. Let
$$
\begin{aligned}
P_\ell &= P(x = x_\ell) \\
&= \int_{\theta_1} \left\{ \prod_{h=2}^{s} \int_{\theta_h}
\prod_{j=1}^{n_h} [F_j(\theta)]^{x_{\ell j}} [1 - F_j(\theta)]^{1 - x_{\ell j}}
f(\theta_h)\, d\theta_h \right\} f(\theta_1)\, d\theta_1 \\
&= \int_{\theta_1} \left\{ \prod_{h=2}^{s} \int_{\theta_h}
L_{\ell h}(\theta)\, f(\theta_h)\, d\theta_h \right\} f(\theta_1)\, d\theta_1.
\end{aligned}
\tag{6}
$$

Then the log likelihood is

$$
\log L = \sum_{\ell=1}^{S} r_\ell \log P_\ell,
\tag{7}
$$

where $S$ denotes the number of unique response patterns and $r_\ell$ the
observed frequency of pattern $\ell$. The derivative of the log marginal
likelihood with respect to a general item parameter $v_j$ is as follows.

Let

$$
E_{\ell h}(\theta_1) =
\frac{\prod_{h'=2}^{s} \int_{\theta_{h'}} L_{\ell h'}(\theta)\, f(\theta_{h'})\, d\theta_{h'}}
{\int_{\theta_h} L_{\ell h}(\theta)\, f(\theta_h)\, d\theta_h}.
\tag{8}
$$

Then

$$
\frac{\partial \log L}{\partial v_j}
= \sum_{\ell=1}^{S} \frac{r_\ell}{P_\ell}
\left( \frac{\partial P_\ell}{\partial v_j} \right)
\tag{9}
$$

$$
= \sum_{\ell=1}^{S} \frac{r_\ell}{P_\ell} \int_{\theta_1} E_{\ell h}(\theta_1)
\left\{ \int_{\theta_h}
\frac{x_{\ell j} - F_j(\theta)}{F_j(\theta)[1 - F_j(\theta)]}\,
L_{\ell h}(\theta)\, \frac{\partial F_j(\theta)}{\partial v_j}\,
f(\theta_h)\, d\theta_h \right\} f(\theta_1)\, d\theta_1,
\tag{10}
$$

where $h$ is the subsection to which item $j$ belongs.

Following Bock and Aitkin (1981), the marginal likelihood equations can be
solved, using the EM algorithm of Dempster, Laird & Rubin (1977), by replacing
the integrals with Gauss-Hermite quadratures and rearranging terms into the
two-dimensional form:

$$
\frac{\partial \log L}{\partial v_j} \approx
\sum_{q_1}^{Q} \sum_{q_h}^{Q}
\frac{\bar{r}_j(X) - \bar{N}_h(X)\, F_j(X)}{F_j(X)[1 - F_j(X)]}
\left( \frac{\partial F_j(X)}{\partial v_j} \right)
A(X_{q_h})\, A(X_{q_1}),
\tag{11}
$$

where $X = (X_{q_1}, X_{q_h})$,

$$
\bar{r}_j(X) = \sum_{\ell=1}^{S} r_\ell\, x_{\ell j}\,
[E_{\ell h}(X_{q_1})]\, L_{\ell h}(X_{q_1}, X_{q_h}) / P_\ell,
\tag{12}
$$

and

$$
\bar{N}_h(X) = \sum_{\ell=1}^{S} r_\ell\,
[E_{\ell h}(X_{q_1})]\, L_{\ell h}(X_{q_1}, X_{q_h}) / P_\ell.
\tag{13}
$$

It should be noted that these equations are similar to those in the
unrestricted case, except that in the bi-factor case, the conditional
probability of response pattern $x_{\ell h}$ (i.e., the responses to the items
$j$ in subsection $h$ for response pattern $\ell$) is weighted by the factor
$E_{\ell h}(X_{q_1})$. Furthermore, since each item appears in only one
subsection ($h$), the $\bar{N}$ now vary with $h$, in contrast to the
unrestricted case. As such, the $\bar{N}_h$ denote the effective sample size
for subset $h$ at quadrature point $(X_{q_1}, X_{q_h})$. When weighted by
$A(X)$ and summed over the quadrature nodes for each subsection, $\bar{N}_h$
yields the total number of respondents, whereas the corresponding weighting
and summation for $\bar{r}_j$ yields the total number of respondents answering
item $j$ correctly.

From provisional parameter values, each E-step yields $\bar{r}_j$ and
$\bar{N}_h$, the expectations of the complete data statistics computed
conditional on the incomplete data (see Bock, Gibbons, & Muraki, 1988). The
subsequent M-step solves equation (10) using conventional maximum likelihood
multiple probit analysis, substituting the provisional expectations of
$\bar{r}_j$ and $\bar{N}_h$ (see Bock & Jones, 1968).

4  Illustration

To illustrate the application of the bi-factor IRT model, we have evaluated 20
items selected from an ACT natural science test, for a random sample of 1000
examinees (we are indebted to Terry Ackerman and Mark Reckase for these data).
This test involves a series of questions regarding each of four paragraphs.
For the purpose of this illustration, we selected the first 5 items from each
of the four paragraphs.

Table 1 displays the unrestricted promax-rotated 4-factor solution, which
adequately fit these data: the improvement in fit of a four-factor model over
a three-factor model was significant ($\chi^2_{17} = 31.59$, $p < .02$),
whereas the improvement in fit of five factors over four factors was not
($\chi^2_{16} = 18.44$, $p < .30$). Inspection of Table 1 reveals that each
factor is dominated by items from a particular paragraph. In contrast, the
estimated factor loadings for the bi-factor model (see Table 2) with $s = 5$
(i.e., one primary dimension and four paragraph-specific dimensions) revealed
a strong general ability dimension, as well as appreciable within-paragraph
associations. The fit of the restricted model was not significantly different
from the fit of either the four-factor ($\chi^2 = 23.83$, $p < .99$) or the
five-factor ($\chi^2_{50} = 43.22$, $p < .95$) unrestricted models. Inspection
of the loadings within each paragraph reveals that the intra-paragraph item
associations are quite variable.

As a computational note, we should point out that the numerical precision of
the bi-factor solution represents a major improvement over the unrestricted
solution. Given that the bi-factor solution only requires approximation of a
two-dimensional integral, we were able to use 100 quadrature points (i.e., 10
in each dimension) instead of the 243 quadrature points used in the
unrestricted five-factor solution (i.e., 3 in each dimension). Five factors
probably represent the highest-dimensional solution that is computationally
tractable at this time. Parameters of the unrestricted models were estimated
using the TESTFACT program (Wilson, Wood & Gibbons, 1984).
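
The arithmetic behind these grid sizes is simple enough to state as code. The helper names below are hypothetical; the sketch only restates that a full product rule needs $Q^s$ points, whereas each evaluation in the bi-factor case runs over a $Q \times Q$ grid.

```python
def full_grid_points(points_per_dim, n_factors):
    """Number of nodes in a full product quadrature over all factors."""
    return points_per_dim ** n_factors

def bifactor_grid_points(points_per_dim):
    """Number of nodes in the two-dimensional grid used by the bi-factor model."""
    return points_per_dim ** 2

unrestricted = full_grid_points(3, 5)      # 3 points in each of 5 dimensions
restricted = bifactor_grid_points(10)      # 10 points in each of 2 dimensions
same_precision = full_grid_points(10, 5)   # 10 points per dimension, unrestricted

print(unrestricted, restricted, same_precision)  # 243 100 100000
```

Matching the bi-factor model's 10 points per dimension in an unrestricted five-factor solution would require 100,000 nodes, which illustrates why only 3 points per dimension were practical there.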

5  A  Simple  Structure  Model 

Consider an orthogonal simple structure factor model in which each item loads
on one and only one of $s$ dimensions. This satisfies a complete simple
structure model as defined by Thurstone (1947), which for measurement data
could be evaluated using methods for confirmatory factor analysis (Jöreskog,
1969). This is, of course, a simplification of the bi-factor model in which
there is no primary dimension. In this case, the unconditional probability in
(5) is reduced to the unidimensional form,

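$$
P_\ell \approx \prod_{h=1}^{s} \sum_{q_h}^{Q}
\left\{ \prod_{j=1}^{n_h}
[F_j(X_{q_h})]^{x_{\ell j}} [1 - F_j(X_{q_h})]^{1 - x_{\ell j}}
\right\} A(X_{q_h}),
\tag{14}
$$

where

$$
F_j(X_{q_h}) =
\Phi\!\left[ -\,\frac{\gamma_j - \alpha_{jh} X_{q_h}}{\sqrt{1 - \alpha_{jh}^2}} \right];
$$

that is, (5) reduces to the product of the $s$ independent unidimensional
probabilities. Writing $L_{\ell h}(X_{q_h})$ for the braced product in (14),
the likelihood equations in (11) can then be approximated by

$$
\frac{\partial \log L}{\partial v_j} \approx
\sum_{q_h}^{Q}
\frac{\bar{r}_j(X_{q_h}) - \bar{N}_h(X_{q_h})\, F_j(X_{q_h})}
{F_j(X_{q_h})[1 - F_j(X_{q_h})]}
\left( \frac{\partial F_j(X_{q_h})}{\partial v_j} \right) A(X_{q_h}),
\tag{15}
$$

where

$$
\bar{r}_j(X_{q_h}) = \sum_{\ell=1}^{S} r_\ell\, x_{\ell j}\,
L_{\ell h}(X_{q_h}) / P_{\ell h},
\tag{16}
$$

and

$$
\bar{N}_h(X_{q_h}) = \sum_{\ell=1}^{S} r_\ell\, L_{\ell h}(X_{q_h}) / P_{\ell h}.
\tag{17}
$$

In this case, $P_{\ell h}$ represents the constant

$$
P_{\ell h} = \sum_{q_h}^{Q} L_{\ell h}(X_{q_h})\, A(X_{q_h}),
$$

and

$$
P_\ell = \prod_{h=1}^{s} P_{\ell h}.
$$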

It is interesting to note that $\bar{r}_j$ and $\bar{N}_h$ now only contain
information from the specific subset of items ($h$) of which item $j$ is a
member. This is, of course, due to the independence between the subsets that
results from the simple structure.
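
As a minimal sketch of the corresponding E-step, the code below computes the expected counts for a single item subset under simple structure. It assumes NumPy, assumes that a correct response means the latent variable exceeds its threshold, and uses hypothetical names (`e_step_subset`) and made-up data. The two sums printed at the end should recover the total number of respondents and the number answering each item correctly.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def e_step_subset(patterns, freq, gamma, alpha, n_nodes=10):
    """Expected counts for one item subset h of a simple structure model.

    patterns : (S, n_h) array of 0/1 responses to the items in subset h
    freq     : (S,) array of pattern frequencies r_l
    Returns the nodes X_q, weights A(X_q), r_bar (n_h, Q) and N_bar (Q,).
    """
    X, A = hermegauss(n_nodes)
    A = A / math.sqrt(2.0 * math.pi)
    patterns = np.asarray(patterns, dtype=float)
    freq = np.asarray(freq, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    alpha = np.asarray(alpha, dtype=float)

    # F[j, q]: assumed probability of a correct response to item j at node q.
    z = -(gamma[:, None] - alpha[:, None] * X[None, :])
    z = z / np.sqrt(1.0 - alpha[:, None] ** 2)
    F = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

    # L[l, q]: likelihood of pattern l at node q (product over items in h).
    per_item = np.where(patterns[:, :, None] == 1.0, F[None, :, :], 1.0 - F[None, :, :])
    L = per_item.prod(axis=1)

    P = L @ A                               # marginal probability of each pattern
    post = freq[:, None] * L / P[:, None]   # r_l * L_lh(X_q) / P_lh
    N_bar = post.sum(axis=0)                # effective sample size at each node
    r_bar = patterns.T @ post               # expected number correct, per item
    return X, A, r_bar, N_bar

# Hypothetical data: four observed patterns on a three-item subset.
patterns = [[1, 1, 0], [0, 1, 0], [1, 0, 1], [0, 0, 0]]
freq = [10, 5, 7, 3]
X, A, r_bar, N_bar = e_step_subset(patterns, freq,
                                   gamma=[-0.2, 0.1, 0.4],
                                   alpha=[0.5, 0.6, 0.4])

print(N_bar @ A)   # total number of respondents
print(r_bar @ A)   # number of respondents answering each item correctly
```

Because the posterior weights for each pattern sum to one over the nodes, the weighted totals depend only on the data and not on the provisional parameter values.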

Application of the simple structure model to the ACT natural science test
example yields the item parameters displayed in Table 3. Inspection of the
parameter estimates in Table 3 reveals that removal of the primary factor
increases the magnitude of the loadings on the individual paragraph
dimensions. In terms of model fit, both the bi-factor model
($\chi^2_{20} = 336$, $p < .0001$) and the unrestricted four-factor model
($\chi^2_{54} = 361$, $p < .0001$) provide significant improvements in fit
over the simple structure model, indicating that the test is in fact measuring
a primary ability dimension and not merely four independent realms of
knowledge.
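
The degrees of freedom for such comparisons are differences in the number of free parameters. The sketch below counts parameters under the assumption, not stated in the text, that an unrestricted $k$-factor solution carries $k(k-1)/2$ rotational constraints; the bi-factor count of $3n$ follows the parameter list given in Section 3, and the simple structure count of $2n$ drops the primary loadings. The function names are hypothetical.

```python
def unrestricted_params(n_items, n_factors):
    """Thresholds plus loadings, less k(k-1)/2 rotational constraints (assumed)."""
    k = n_factors
    return n_items + n_items * k - k * (k - 1) // 2

def bifactor_params(n_items):
    """Thresholds, primary loadings, and one group loading per item."""
    return 3 * n_items

def simple_structure_params(n_items):
    """Thresholds and a single loading per item."""
    return 2 * n_items

n = 20
df_4_vs_3 = unrestricted_params(n, 4) - unrestricted_params(n, 3)
df_5_vs_4 = unrestricted_params(n, 5) - unrestricted_params(n, 4)
df_bifactor_vs_simple = bifactor_params(n) - simple_structure_params(n)
df_4_vs_simple = unrestricted_params(n, 4) - simple_structure_params(n)

print(df_4_vs_3, df_5_vs_4, df_bifactor_vs_simple, df_4_vs_simple)  # 17 16 20 54
```

Under this counting rule, the bi-factor model adds only one parameter per item to the simple structure model, which is why a large chi-square on 20 degrees of freedom points so clearly to a primary dimension.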


6  Discussion 

The bi-factor model presented here provides a natural alternative to the
traditional conditionally-independent unidimensional IRT model. When potential
sources of conditional dependence are known in advance, as in the case of
paragraph comprehension tests or tests in which two or more methods of item
presentation are involved, the item bi-factor solution provides an excellent
alternative. An attractive by-product of this model is that it requires only
the evaluation of a two-dimensional integral, regardless of the number of
potential subtests, paragraphs, or content areas. These different content
areas are, of course, assumed to be independent conditional on the primary
ability dimension that the test was designed to measure. As such, the
limitations on the dimensionality of the full-information item factor analysis
model embodied in the TESTFACT program (Wilson, Wood & Gibbons, 1984) do not
apply. Of course, the subsections (e.g., paragraphs) must be known in advance.

In certain situations, for example psychiatric measurement (Gibbons, 1985),
the existence of a primary dimension (e.g., depression) is itself in question.
In this case, comparison of the bi-factor and simple structure solutions
presented here is of particular interest. Item bi-factor analysis could
therefore help answer the question of whether depression is a unitary disorder
or a mixture of a series of qualitatively distinct abnormalities, a question
that has long plagued psychiatric researchers. Comparison of the fit of the
bi-factor and simple structure models provides a tool for investigating such
problems in psychiatric research and other areas as well.

Finally, in those cases in which little is known about the structure of a
particular test, but little confidence can be placed in the assumption of
conditional independence, the more general solution presented by Gibbons et
al. (1989), using Clark's (1961) formulae for the moments of $n$ jointly
normal variables, could be used. This procedure uses a direct approximation to
the multivariate normal distribution that underlies the item-response
function, without restrictions on the form of the inter-item residual
covariances. With it, the assumption of conditional independence is not
required. Further work in this area is underway.
Table 1

Full-Information Item Factor Analysis - Unrestricted Promax Solution
ACT Natural Science Test - 20 items and 1000 subjects

Item    $\gamma_j$     $\alpha_{j1}$  $\alpha_{j2}$  $\alpha_{j3}$  $\alpha_{j4}$
1       -.215           .401          -.005          -.036           .216
2       -.385           .185          -.019          -.007           .105
3       -.356           .667          -.070          -.081          -.081
4       -.098           .619           .013           .044          -.022
5       -.029           .562          -.092          -.059           .119
6       -.582           .129           .068           .256           .030
7       -.585           .184          -.211           .419           .102
8       -.137          -.037          -.061           .025           .172
9       -.246           .238           .063           .362          -.284
10      -.089          -.224           .128           .620           .060
11      -.049           .182           .135          -.034           .311
12      -.407          -.024          -.065           .124           .320
13      -.265           .247           .082           .020           .173
14      -.051           .137           .005           .007           .585
15       .040           .224           .129          -.045           .295
16       .345           .153           .289          -.122          -.109
17       .167          -.007           .682           .089          -.044
18       .172          -.096           .520          -.024           .120
19       .543           .008           .500           .067           .091
20       .672          -.073          -.010           .004           .163
Table 2

Full-Information Item Bi-Factor Analysis
ACT Natural Science Test - 20 items and 1000 subjects

Item    $\gamma_j$     $\alpha_{j1}$  $\alpha_{j2}$  $\alpha_{j3}$  $\alpha_{j4}$  $\alpha_{j5}$
1       -.230           .524           .129
2       -.392           .232           .115
3       -.370           .411           .427
4       -.118           .548           .278
5       -.046           .489           .338
6       -.593           .311                          .277
7       -.600           .376                          .314
8       -.138           .087                         -.019
9       -.259           .207                          .390
10      -.103           .226                          .476
11      -.062           .484                                         .141
12      -.413           .261                                         .135
13      -.277           .423                                         .199
14      -.066           .573                                         .187
15       .025           .492                                         .260
16       .340          [.1t12]                                                      .261
17       .150           .306                                                        .662
18       .160           .240                                                        .571
19       .528           .340                                                        .493
20       .671           .061                                                        .031
Table 3

Full-Information Simple Structure Item Factor Analysis
ACT Natural Science Test - 20 items and 1000 subjects

Item    $\gamma_j$     $\alpha_{j1}$  $\alpha_{j2}$  $\alpha_{j3}$  $\alpha_{j4}$
1       -.224           .482
2       -.391           .251
3       -.368           .571
4       -.111           .612
5       -.040           .585
6       -.592                          .408
7       -.597                          .467
8       -.138                          .032
9       -.258                          .429
10      -.102                          .509
11      -.056                                         .489
12      -.412                                         .297
13      -.273                                         .449
14      -.058                                         .591
15       .031                                         .566
16       .341                                                        .282
17       .157                                                        .732
18       .163                                                        .616
19       .534                                                        .597
20       .671                                                        .057